The estimation of the parameters of a linear regression model in which
the dependent variable is a fraction or proportion frequently occurs in
statistical analyses. This proportion reflects some specific activity,
such as the proportion of income spent on some good, and is related to a
number of exogenous characteristics. The estimation of a proportion
relationship is not always performed in isolation. The relationship of
the exogenous variables to the remainder of the population, or
complementary proportions, is usually of equal interest.

*Any views expressed in this paper are those of the author. They should
not be interpreted as reflecting the views of The RAND Corporation or the
official opinion or policy of any of its governmental or private research
sponsors. Papers are reproduced by The RAND Corporation as a courtesy to
members of its staff.

An example of such an analysis has been performed by McCall in studying
the movements of people into and/or out of various income classes over
time. In one part of his study McCall calculates a first-order Markov
transition matrix of people's movements into, out of, or remaining in
income classes. Each cell contains a transition probability reflecting
the propensity of individuals to move out of a given income class or
remain in it. The changes in the transition probabilities for each cell
are then related to changes in GNP for the period from 1958 to 1966. In
each matrix the probabilities sum to one and each cell will undoubtedly
be affected differently by changes in GNP.

In McCall's analysis and in many analyses, the probabilities or
proportions across all cells sum to one in the raw data. For prediction
purposes it is desirable that the estimated proportions always sum to one
as well. In this note we show that the estimated proportions sum to one
and no constraint is needed on the proportions if the parameter estimates
are BLUE. If the parameters are not, then a constraint can be constructed
using Zellner's² technique of Seemingly Unrelated Least Squares (SULS) to
ensure that the estimated proportions sum to one. It is further noted
that the results can be generalized to any system of equations containing
the same exogenous variables in each equation and specifying an exact
linear constraint on the dependent variables for all observations in the
raw data.

UNCONSTRAINED LEAST SQUARES

Assume that there are $z$ proportions $p_i$ and $n$ observations on each
$p_i$ such that $\sum_{i=1}^{z} p_{ij} = 1$ for $j = 1, \ldots, n$. Each
regression equation can then be written

(1)  $p_i = X\beta_i + \varepsilon_i, \quad i = 1, \ldots, z,$

where each $p_i$ is an $n \times 1$ vector of observations
$(p_{i1}, \ldots, p_{in})'$, $X = (1, X_1, \ldots, X_s)$ where each $X_k$
is an $n \times 1$ vector of observations, $\beta_i$ is an
$(s+1) \times 1$ vector of regression coefficients,
$\varepsilon_i \sim N(0, \sigma^2)$, $E(\varepsilon_i X_k) = 0$ for all
$i$ and $k$, $E(\varepsilon_i \varepsilon_{i'}) = 0$,
$\sum_{i=1}^{z} p_{ij} = 1$ for all $j$, and $0 \le p_{ij} \le 1$ for all
$i$, $j$.

²A. Zellner, "An Efficient Method of Estimating Seemingly Unrelated
Regressions and Tests for Aggregation Bias," Journal of the American
Statistical Association, June 1962.

The equations in (1) actually represent $z-1$ independent equations and
$\sum_{i=1}^{z-1} p_i = 1 - p_z$. Thus $p_z$ can be easily calculated,
but the separate effects of the exogenous variables on $p_z$ are omitted.
To ensure that $\sum_{i=1}^{z} \hat{p}_i = 1$ and to obtain the separate
coefficients $\beta_{ik}$ for all $i$ and $k$, it is not necessary to use
a Lagrangean constraint. Estimation of (1) by Ordinary Least Squares
(OLS) yields the unbiased estimates

(2)  $\hat{\beta}_i = (X'X)^{-1}X'p_i, \quad i = 1, \ldots, z,$

(3)  $\hat{p}_i = X\hat{\beta}_i, \quad i = 1, \ldots, z.$


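By summing the estimated proportions over all $i$, we obtain

(4)  $\sum_{i=1}^{z} \hat{p}_i = \sum_{i=1}^{z} X\hat{\beta}_i = \sum_{i=1}^{z} X(X'X)^{-1}X'p_i$

     $= X(X'X)^{-1}X' \sum_{i=1}^{z} p_i = \sum_{i=1}^{z} p_i = 1.$

Here 1 denotes the $n \times 1$ vector of ones. The next-to-last equality
holds because $\sum_{i=1}^{z} p_i$ is the vector of ones, which is the
first column of $X$ and is therefore left unchanged by the projection
$X(X'X)^{-1}X'$.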

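Therefore OLS ensures that the $\hat{p}_i$ sum to one regardless of the
variances of each estimate as long as the $\hat{\beta}_i$ are unbiased.
Similarly, the variance of the sum of the estimated proportions is zero
using the above result:

(5)  $\operatorname{Var}\Bigl(\sum_{i=1}^{z} \hat{p}_i\Bigr) = E\Bigl(\sum_{i=1}^{z} \hat{p}_i - \sum_{i=1}^{z} p_i\Bigr)^2$

     $= E\Bigl[\Bigl(\sum_{i=1}^{z} \hat{p}_i\Bigr)^2 + \Bigl(\sum_{i=1}^{z} p_i\Bigr)^2 - 2\Bigl(\sum_{i=1}^{z} \hat{p}_i\Bigr)\Bigl(\sum_{i=1}^{z} p_i\Bigr)\Bigr]$

     $= 0.$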
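The adding-up property in (4) can be checked numerically. The following
is a minimal sketch under stated assumptions: the sample sizes, the
random regressors, and the Dirichlet-generated proportions are all
illustrative and are not data from McCall's study. It fits each equation
by OLS as in (2) and (3) and checks that the fitted proportions sum to
one for every observation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, z, s = 50, 4, 3  # observations, proportions, slope variables (illustrative)

# Common regressor matrix X = (1, X_1, ..., X_s); values are simulated.
X = np.column_stack([np.ones(n), rng.normal(size=(n, s))])

# Simulated raw proportions: each row of P sums to one across the z classes.
P = rng.dirichlet(np.ones(z), size=n)  # shape (n, z), column i is p_i

# Equation (2), all z equations at once: column i of B_hat is beta_hat_i.
B_hat = np.linalg.solve(X.T @ X, X.T @ P)

# Equation (3): fitted proportions, one column per equation.
P_hat = X @ B_hat

# Equation (4): sum of fitted proportions for each observation.
row_sums = P_hat.sum(axis=1)
print(np.allclose(row_sums, 1.0))
```

The check relies only on $X$ containing a constant column and on the raw
proportions summing to one, so other simulated data with those two
properties should behave the same way.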


IMPLICIT CONSTRAINTS ON THE REGRESSION COEFFICIENTS

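The fact that the proportions sum to one implies a constraint on
$\beta_i$ across all $i$ equations. If the $\hat{\beta}_i$ are BLUE (and
the $X_j$ are fixed or independently distributed), then
$\sum_{i=1}^{z} p_i = 1$ implies $\sum_{i=1}^{z} \beta_{ij} = 0$ for all
$j$. Consider

(6)  $d\Bigl(\sum_{i=1}^{z} p_i\Bigr) = \sum_{i=1}^{z} dp_i = \sum_{i=1}^{z} \sum_{j=1}^{s} \frac{\partial p_i}{\partial X_j}\, dX_j$

     $= \sum_{i=1}^{z} \sum_{j=1}^{s} \beta_{ij}\, dX_j$

     $= \sum_{j=1}^{s} \Bigl(\sum_{i=1}^{z} \beta_{ij}\Bigr) dX_j$

     $= (\beta_{11} + \cdots + \beta_{z1})\, dX_1 + \cdots + (\beta_{1s} + \cdots + \beta_{zs})\, dX_s$

     $= 0.$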

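Considering a total change in the summation of the $p_i$ due to a total
change in any exogenous variable, say $X_j$, yields

(7)  $0 = (\beta_{11} + \cdots + \beta_{z1}) \frac{dX_1}{dX_j} + \cdots + (\beta_{1j} + \cdots + \beta_{zj}) + \cdots + (\beta_{1s} + \cdots + \beta_{zs}) \frac{dX_s}{dX_j}.$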


But $dX_l/dX_j = 0$ for $l = 1, \ldots, s$; $l \ne j$. By considering
total changes in $\sum_{i=1}^{z} p_i$ due to $dX_j$, $j = 1, \ldots, s$,
we obtain the result that $\sum_{i=1}^{z} \beta_{ij} = 0$ for every $j$.
The summation of intercept terms, however, does not equal zero, since

(8)  $\beta_{i0} = \bar{p}_i - \sum_{j=1}^{s} \beta_{ij} \bar{X}_j$

and

(9)  $\sum_{i=1}^{z} \beta_{i0} = \sum_{i=1}^{z} \bar{p}_i - \sum_{i=1}^{z} \sum_{j=1}^{s} \beta_{ij} \bar{X}_j$

     $= 1 - \sum_{j=1}^{s} \Bigl(\sum_{i=1}^{z} \beta_{ij}\Bigr) \bar{X}_j.$


Based on the above result of $\sum_{i=1}^{z} \beta_{ij} = 0$ for every
$j$,

(10)  $\sum_{i=1}^{z} \beta_{i0} = 1.$
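Results (6) to (10) can be illustrated on simulated data. The sketch
below is hedged: the data are hypothetical, and the names `B_hat`,
`coef_sums`, and `target` are introduced here for illustration only. It
estimates every equation by OLS with the same regressors and sums each
coefficient across the $z$ equations, which should give one for the
intercepts and zero for every slope.

```python
import numpy as np

rng = np.random.default_rng(1)
n, z, s = 60, 5, 2  # observations, proportions, slope variables (illustrative)

# Same exogenous variables in every equation, with a constant first column.
X = np.column_stack([np.ones(n), rng.normal(size=(n, s))])

# Simulated proportions whose rows sum to one in the raw data.
P = rng.dirichlet(np.ones(z), size=n)

# OLS for all z equations; row 0 of B_hat holds the intercepts beta_i0,
# row j (j >= 1) holds the slopes beta_ij on X_j.
B_hat, *_ = np.linalg.lstsq(X, P, rcond=None)

# Sum each coefficient across the z equations.
coef_sums = B_hat @ np.ones(z)

# Expected pattern: intercepts sum to one, each slope sums to zero.
target = np.zeros(s + 1)
target[0] = 1.0
print(np.allclose(coef_sums, target))
```

Summing the columns of the coefficient matrix in this way is the same
operation as multiplying it by a vector of ones.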


It is immediate from (4) that an exact linear constraint on the dependent
variables in the raw data implies the same exact linear constraint for
the estimated dependent variables across the $z$ equations. Further, the
results of Eqs. (6) to (10) imply that the $\beta_{ij}$ sum to zero
across $i$ and the $\beta_{i0}$ sum to the value of the constraint if an
exact linear constraint exists on the dependent variables.³ Moreover, as
will be seen below, the parameters may be constrained to ensure the exact
linear constraint on the dependent variables if the parameters are not
unbiased.

³Another proof of these results was pointed out to me by B. Efron. Let
$e_j$ represent the $(j \times 1)$ vector of ones. Write all $z$
equations $\hat{p}_i = X\hat{\beta}_i$ side by side to obtain
$\hat{P} = X\hat{B}$, where $\hat{P}$ is $n \times z$, $X$ is
$n \times (s+1)$, and $\hat{B}$ is $(s+1) \times z$. Then
$\hat{P}e_z = X\hat{B}e_z$. But $\hat{P}e_z = e_n$ by equation (4) and
$X(1, 0, \ldots, 0)' = e_n$ by definition. This implies that
$\hat{B}e_z = (1, 0, \ldots, 0)'$, which contains the above results.


ESTIMATION WITH THE COEFFICIENT CONSTRAINT


If the parameters are not BLUE, the $\hat{p}_i$ and $\hat{\beta}_{i0}$
may not sum to one, nor may the $\hat{\beta}_{ij}$ sum to zero across $i$
for every $j$. These conditions can be forced on the system, however, by
treating the $z$ equations as a system of seemingly unrelated regressions
as developed by Zellner. Rewrite (1) as


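(11)  $\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_z \end{bmatrix} = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_z \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_z \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_z \end{bmatrix},$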

where each $X_i$ is a vector of independent variables
$(1, X_{i1}, \ldots, X_{is})$ and each $\beta_i$ a vector
$(\beta_{i0}, \beta_{i1}, \beta_{i2}, \ldots, \beta_{is})$; Eq. (11) can
be simplified as

(12)  $p = X\beta + u.$

Application of least squares yields the BLU estimator,⁴

(13)  $\beta^* = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}p,$

where $\Sigma$ is estimated by the disturbance variance-covariance
matrix.

By further constraining the $\beta_{ij}$ to sum to zero across all $i$
equations for every $j$ and the $\beta_{i0}$ to sum to one, we obtain
simultaneous constraint estimates⁵

(14)  $\tilde{\beta} = \beta^* + (X'\Sigma^{-1}X)^{-1}Q'\bigl[Q(X'\Sigma^{-1}X)^{-1}Q'\bigr]^{-1}(W - Q\beta^*).$


⁴See A. Zellner, "An Efficient Method of Estimating Seemingly Unrelated
Regressions and Tests for Aggregation Bias," Journal of the American
Statistical Association, June 1962.

⁵See H. Theil, Economic Forecasts and Policy (2nd ed.), North-Holland
Publishing Co., Amsterdam, 1961.
The estimate $\tilde{\beta}$ satisfies the constraint
$Q\tilde{\beta} = W$, where $Q$ is a matrix specifying the combination of
the $\beta_{ij}$ and $W$ is a vector describing the constraint. For the
constraints above the intercepts sum to one and the slope coefficients
each sum to zero across equations; therefore $W = (1, 0, \ldots, 0)$, a
$1 \times (s+1)$ vector. Thus all constraints may be forced in the event
that unconstrained OLS is not consistent with the implied constraints on
the $\beta_{ij}$.
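Equations (11) to (14) can be sketched numerically. The example below is
illustrative only and rests on several assumptions that are not in the
text: the data are simulated; each equation is given its own regressor
matrix $X_i$ so that the unconstrained estimates violate the constraints;
$\Sigma$ is taken to be diagonal across equations, with variances
estimated from equation-by-equation OLS residuals; and $Q$ is built as a
row of identity blocks, one plausible way to express "sum each
coefficient across equations".

```python
import numpy as np

rng = np.random.default_rng(2)
n, z, s = 40, 3, 2  # observations, equations, slope variables (illustrative)
k = s + 1           # coefficients per equation, including the intercept

# Hypothetical data: a separate regressor matrix X_i for each equation.
X_list = [np.column_stack([np.ones(n), rng.normal(size=(n, s))])
          for _ in range(z)]
P = rng.dirichlet(np.ones(z), size=n)  # n x z, each row sums to one

# Stack the system as in (11) and (12): p = X beta + u.
p = P.T.reshape(-1)                    # (p_1; ...; p_z)
X = np.zeros((z * n, z * k))
for i, Xi in enumerate(X_list):
    X[i * n:(i + 1) * n, i * k:(i + 1) * k] = Xi

# Assumption: diagonal Sigma, one variance per equation from OLS residuals.
sigma2 = np.empty(z)
for i, Xi in enumerate(X_list):
    b_ols, *_ = np.linalg.lstsq(Xi, P[:, i], rcond=None)
    resid = P[:, i] - Xi @ b_ols
    sigma2[i] = resid @ resid / (n - k)
Sigma_inv = np.kron(np.diag(1.0 / sigma2), np.eye(n))

# Equation (13): beta* = (X' Sigma^-1 X)^-1 X' Sigma^-1 p.
C = np.linalg.inv(X.T @ Sigma_inv @ X)
beta_star = C @ X.T @ Sigma_inv @ p

# Assumed Q: adds up each coefficient position across the z equations.
Q = np.kron(np.ones((1, z)), np.eye(k))  # shape (k, z * k)
W = np.zeros(k)
W[0] = 1.0  # intercepts sum to one, slopes sum to zero

# Equation (14): constrained estimate.
correction = C @ Q.T @ np.linalg.solve(Q @ C @ Q.T, W - Q @ beta_star)
beta_tilde = beta_star + correction

print(np.allclose(Q @ beta_star, W))   # unconstrained estimates
print(np.allclose(Q @ beta_tilde, W))  # constrained estimates
```

With a common regressor matrix in every equation the correction term
would be zero, since the unconstrained estimates already satisfy the
constraints; distinct $X_i$ are used here only to make the adjustment in
(14) visible.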


